@philljj philljj commented Jan 26, 2026

Description

Support kernel-benchmarks with bsdkm:

  • --enable-kernel-benchmarks --enable-freebsdkm

Support bsdkm x86 crypto acceleration:

  • --enable-aesni
  • --enable-aesni-with-avx
  • --enable-sp-asm
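
Before enabling the acceleration options above, it can help to confirm that the build host's CPU reports the relevant features. The following is a minimal sketch, not part of this PR: `has_cpu_feature` is a hypothetical helper, and it assumes the FreeBSD boot log at /var/run/dmesg.boot lists CPU features in angle-bracketed, comma-separated form (e.g. `Features2=0x...<...,AESNI,...,AVX,...>`).

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeeds if feature flag $1 appears
# as a whole token inside a <A,B,C> feature list read from stdin.
has_cpu_feature() {
    # Split on '<', '>' and ',' so each flag sits on its own line, then
    # require an exact whole-line match (so AVX does not match AVX2).
    tr '<>,' '\n\n\n' | grep -qx "$1"
}

# Assumption: the FreeBSD boot log lives here; skipped on other systems.
boot_log=/var/run/dmesg.boot
if [ -r "$boot_log" ]; then
    for feature in AESNI AVX; do
        if has_cpu_feature "$feature" < "$boot_log"; then
            echo "$feature: present"
        else
            echo "$feature: not reported"
        fi
    done
fi
```

If a feature is not reported, the corresponding configure option is probably best left out of the build.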

Register wolfCrypt implementations with the FreeBSD kernel opencrypto framework:

  • --enable-freebsdkm-crypto-register

Notes:

  • crypto-register exists to facilitate testing; it is not considered a complete feature for now.
  • --enable-sp-asm together with --enable-ecc is not supported.

Testing

normal module

./configure --enable-freebsdkm --enable-cryptonly \
  --enable-crypttests --enable-all-crypto --enable-kernel-benchmarks \
  --enable-asm --enable-aesni --enable-aesni-with-avx --enable-sp-asm --disable-ecc

make && sudo kldload bsdkm/libwolfssl.ko
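
After loading, the module's presence can be checked from `kldstat` output. A minimal sketch, assuming the usual `kldstat` table layout where the file name is the last column; `module_loaded` is a hypothetical helper, not something this PR adds:

```shell
#!/bin/sh
# Hypothetical helper (not from the PR): succeeds if the module file named
# in $1 appears as the last column of kldstat-style output on stdin.
module_loaded() {
    awk -v name="$1" '$NF == name { found = 1 } END { exit !found }'
}

# Only query the kernel where kldstat exists (i.e. on FreeBSD).
if command -v kldstat >/dev/null 2>&1; then
    if kldstat | module_loaded libwolfssl.ko; then
        echo "libwolfssl.ko is loaded"
    else
        echo "libwolfssl.ko is not loaded" >&2
    fi
fi
```

With --enable-crypttests and --enable-kernel-benchmarks, the module's test and benchmark output should then be visible in the system log (for example via `dmesg`).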

crypto-register

  1. Note: this test requires OpenSSL 1.1.1 (the openssl111 package) for crypto engine offload to the kernel. Install it with:
sudo pkg install openssl111-1.1.1w_2
  2. Build and load libwolfssl.ko:
./configure --enable-freebsdkm --enable-cryptonly --enable-debug \
  --enable-crypttests --enable-freebsdkm-crypto-register \
  --enable-asm --enable-intelasm --enable-aesni --enable-sp-asm --disable-ecc \
  --enable-aesgcm-stream \
  CFLAGS="$BASE_CFLAGS -DWOLFSSL_BSDKM_FPU_DEBUG"

make && sudo kldload bsdkm/libwolfssl.ko
  3. Load cryptodev:
sudo kldload cryptodev
  4. Run this script to fork userspace openssl speed tests, which will saturate all CPUs and generate a lot of system log activity.

  5. When finished, unload both modules, cryptodev first:

sudo kldunload cryptodev libwolfssl

You'll see cleanup FPU debug logging like:

libwolf0: info: exiting freesession
info: libwolfssl 5.8.4 cleanup complete.
info: wolfkmod_vecreg_exit: fpu_states[0] = 0, 0
info: wolfkmod_vecreg_exit: fpu_states[1] = 0, 0
info: wolfkmod_vecreg_exit: fpu_states[2] = 0, 0
info: wolfkmod_vecreg_exit: fpu_states[3] = 0, 0
libwolf0: info: crid unregistered: 2
libwolf0: info: exiting detach
libwolf0: detached

Benchmarks

Built with --enable-asm --enable-intelasm --enable-aesni --enable-aesni-with-avx --enable-sp-asm --disable-ecc:

Math: 	Multi-Precision: Wolf(SP) word-size=64 bits=4096 sp_int.c
	Single Precision: rsa/dh 2048 3072 4096 asm sp_x86_64.c
	Assembly Speedups: INTELASM ALIGN X86_64_BUILD
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
RNG                      65.0 MiB took 1.000 seconds, 65.000 MiB/s Cycles per byte = 19.44
AES-128-CBC-enc          1280.0 MiB took 1.000 seconds, 1280.000 MiB/s Cycles per byte = 2.48
AES-128-CBC-dec          5580.0 MiB took 1.000 seconds, 5580.000 MiB/s Cycles per byte = 0.58
AES-192-CBC-enc          1090.0 MiB took 1.000 seconds, 1090.000 MiB/s Cycles per byte = 2.98
AES-192-CBC-dec          4605.0 MiB took 1.000 seconds, 4605.000 MiB/s Cycles per byte = 0.71
AES-256-CBC-enc          940.0 MiB took 1.000 seconds, 940.000 MiB/s Cycles per byte = 3.47
AES-256-CBC-dec          3885.0 MiB took 1.000 seconds, 3885.000 MiB/s Cycles per byte = 0.83
AES-128-GCM-enc          5505.0 MiB took 1.000 seconds, 5505.000 MiB/s Cycles per byte = 0.59
AES-128-GCM-dec          5275.0 MiB took 1.000 seconds, 5275.000 MiB/s Cycles per byte = 0.62
AES-192-GCM-enc          4280.0 MiB took 1.000 seconds, 4280.000 MiB/s Cycles per byte = 0.76
AES-192-GCM-dec          4660.0 MiB took 1.000 seconds, 4660.000 MiB/s Cycles per byte = 0.70
AES-256-GCM-enc          3790.0 MiB took 1.000 seconds, 3790.000 MiB/s Cycles per byte = 0.86
AES-256-GCM-dec          3635.0 MiB took 1.000 seconds, 3635.000 MiB/s Cycles per byte = 0.89
AES-128-GCM-enc-no_AAD   4880.0 MiB took 1.000 seconds, 4880.000 MiB/s Cycles per byte = 0.67
AES-128-GCM-dec-no_AAD   5340.0 MiB took 1.000 seconds, 5340.000 MiB/s Cycles per byte = 0.61
AES-192-GCM-enc-no_AAD   4630.0 MiB took 1.000 seconds, 4630.000 MiB/s Cycles per byte = 0.70
AES-192-GCM-dec-no_AAD   4510.0 MiB took 1.000 seconds, 4510.000 MiB/s Cycles per byte = 0.72
AES-256-GCM-enc-no_AAD   3855.0 MiB took 1.000 seconds, 3855.000 MiB/s Cycles per byte = 0.84
AES-256-GCM-dec-no_AAD   3960.0 MiB took 1.000 seconds, 3960.000 MiB/s Cycles per byte = 0.82
GMAC Table 4-bit         1635.0 MiB took 1.000 seconds, 1635.000 MiB/s Cycles per byte = 1.99
CHACHA                   3160.0 MiB took 1.000 seconds, 3160.000 MiB/s Cycles per byte = 1.03
CHA-POLY                 2110.0 MiB took 1.000 seconds, 2110.000 MiB/s Cycles per byte = 1.54
POLY1305                 6380.0 MiB took 1.000 seconds, 6380.000 MiB/s Cycles per byte = 0.51
SHA                      545.0 MiB took 1.000 seconds, 545.000 MiB/s Cycles per byte = 6.02
SHA-224                  450.0 MiB took 1.000 seconds, 450.000 MiB/s Cycles per byte = 7.20
SHA-256                  455.0 MiB took 1.000 seconds, 455.000 MiB/s Cycles per byte = 7.16
SHA-384                  670.0 MiB took 1.000 seconds, 670.000 MiB/s Cycles per byte = 4.81
SHA-512                  675.0 MiB took 1.000 seconds, 675.000 MiB/s Cycles per byte = 4.84
SHA-512/224              670.0 MiB took 1.000 seconds, 670.000 MiB/s Cycles per byte = 4.83
SHA-512/256              685.0 MiB took 1.000 seconds, 685.000 MiB/s Cycles per byte = 4.75
SHA3-224                 385.0 MiB took 1.000 seconds, 385.000 MiB/s Cycles per byte = 8.48
SHA3-256                 355.0 MiB took 1.000 seconds, 355.000 MiB/s Cycles per byte = 9.10
SHA3-384                 275.0 MiB took 1.000 seconds, 275.000 MiB/s Cycles per byte = 11.92
SHA3-512                 190.0 MiB took 1.000 seconds, 190.000 MiB/s Cycles per byte = 17.15
HMAC-SHA                 530.0 MiB took 1.000 seconds, 530.000 MiB/s Cycles per byte = 6.08
HMAC-SHA224              445.0 MiB took 1.000 seconds, 445.000 MiB/s Cycles per byte = 7.30
HMAC-SHA256              450.0 MiB took 1.000 seconds, 450.000 MiB/s Cycles per byte = 7.26
HMAC-SHA384              670.0 MiB took 1.000 seconds, 670.000 MiB/s Cycles per byte = 4.81
HMAC-SHA512              670.0 MiB took 1.000 seconds, 670.000 MiB/s Cycles per byte = 4.87
PBKDF2                   44.0 KiB took 1.000 seconds, 44.125 KiB/s Cycles per byte = 75036.99
RSA     2048   public     55800 ops took 1.000 sec, avg 0.018 ms, 55800.000 ops/sec, 3407511926 cycles 61066.5 Cycles/op
RSA     2048  private      1900 ops took 1.000 sec, avg 0.526 ms, 1900.000 ops/sec, 3521780094 cycles 1853568.5 Cycles/op
DH      2048  key gen      3633 ops took 1.000 sec, avg 0.275 ms, 3633.000 ops/sec, 3292563856 cycles 906293.4 Cycles/op
DH      2048    agree      3800 ops took 1.000 sec, avg 0.263 ms, 3800.000 ops/sec, 3434799514 cycles 903894.6 Cycles/op
Benchmark complete

The same benchmark, but built with --disable-asm:

Math: 	Multi-Precision: Wolf(SP) word-size=64 bits=4096 sp_int.c
wolfCrypt Benchmark (block bytes 1048576, min 1.0 sec each)
RNG                      90.0 MiB took 1.000 seconds, 90.000 MiB/s Cycles per byte = 28.81
AES-128-CBC-enc          275.0 MiB took 1.000 seconds, 275.000 MiB/s Cycles per byte = 11.65
AES-128-CBC-dec          300.0 MiB took 1.000 seconds, 300.000 MiB/s Cycles per byte = 10.82
AES-192-CBC-enc          240.0 MiB took 1.000 seconds, 240.000 MiB/s Cycles per byte = 13.50
AES-192-CBC-dec          250.0 MiB took 1.000 seconds, 250.000 MiB/s Cycles per byte = 12.87
AES-256-CBC-enc          210.0 MiB took 1.000 seconds, 210.000 MiB/s Cycles per byte = 15.50
AES-256-CBC-dec          220.0 MiB took 1.000 seconds, 220.000 MiB/s Cycles per byte = 15.02
AES-128-GCM-enc          150.0 MiB took 1.000 seconds, 150.000 MiB/s Cycles per byte = 21.45
AES-128-GCM-dec          155.0 MiB took 1.000 seconds, 155.000 MiB/s Cycles per byte = 21.40
AES-192-GCM-enc          135.0 MiB took 1.000 seconds, 135.000 MiB/s Cycles per byte = 23.30
AES-192-GCM-dec          145.0 MiB took 1.000 seconds, 145.000 MiB/s Cycles per byte = 22.99
AES-256-GCM-enc          125.0 MiB took 1.000 seconds, 125.000 MiB/s Cycles per byte = 25.39
AES-256-GCM-dec          130.0 MiB took 1.000 seconds, 130.000 MiB/s Cycles per byte = 25.42
AES-128-GCM-enc-no_AAD   150.0 MiB took 1.000 seconds, 150.000 MiB/s Cycles per byte = 21.27
AES-128-GCM-dec-no_AAD   155.0 MiB took 1.000 seconds, 155.000 MiB/s Cycles per byte = 21.18
AES-192-GCM-enc-no_AAD   140.0 MiB took 1.000 seconds, 140.000 MiB/s Cycles per byte = 23.09
AES-192-GCM-dec-no_AAD   140.0 MiB took 1.000 seconds, 140.000 MiB/s Cycles per byte = 23.28
AES-256-GCM-enc-no_AAD   130.0 MiB took 1.000 seconds, 130.000 MiB/s Cycles per byte = 24.95
AES-256-GCM-dec-no_AAD   135.0 MiB took 1.000 seconds, 135.000 MiB/s Cycles per byte = 24.80
GMAC Table 4-bit         360.0 MiB took 1.000 seconds, 360.000 MiB/s Cycles per byte = 8.77
CHACHA                   480.0 MiB took 1.000 seconds, 480.000 MiB/s Cycles per byte = 6.75
CHA-POLY                 375.0 MiB took 1.000 seconds, 375.000 MiB/s Cycles per byte = 8.67
POLY1305                 1730.0 MiB took 1.000 seconds, 1730.000 MiB/s Cycles per byte = 1.87
SHA                      545.0 MiB took 1.000 seconds, 545.000 MiB/s Cycles per byte = 5.97
SHA-224                  240.0 MiB took 1.000 seconds, 240.000 MiB/s Cycles per byte = 13.56
SHA-256                  225.0 MiB took 1.000 seconds, 225.000 MiB/s Cycles per byte = 14.62
SHA-384                  325.0 MiB took 1.000 seconds, 325.000 MiB/s Cycles per byte = 9.90
SHA-512                  335.0 MiB took 1.000 seconds, 335.000 MiB/s Cycles per byte = 9.73
SHA-512/224              305.0 MiB took 1.000 seconds, 305.000 MiB/s Cycles per byte = 10.59
SHA-512/256              315.0 MiB took 1.000 seconds, 315.000 MiB/s Cycles per byte = 10.36
SHA3-224                 285.0 MiB took 1.000 seconds, 285.000 MiB/s Cycles per byte = 11.43
SHA3-256                 280.0 MiB took 1.000 seconds, 280.000 MiB/s Cycles per byte = 11.56
SHA3-384                 190.0 MiB took 1.000 seconds, 190.000 MiB/s Cycles per byte = 17.04
SHA3-512                 155.0 MiB took 1.000 seconds, 155.000 MiB/s Cycles per byte = 21.42
HMAC-SHA                 500.0 MiB took 1.000 seconds, 500.000 MiB/s Cycles per byte = 6.37
HMAC-SHA224              230.0 MiB took 1.000 seconds, 230.000 MiB/s Cycles per byte = 14.36
HMAC-SHA256              220.0 MiB took 1.000 seconds, 220.000 MiB/s Cycles per byte = 14.72
HMAC-SHA384              310.0 MiB took 1.000 seconds, 310.000 MiB/s Cycles per byte = 10.37
HMAC-SHA512              305.0 MiB took 1.000 seconds, 305.000 MiB/s Cycles per byte = 10.63
PBKDF2                   28.0 KiB took 1.000 seconds, 28.438 KiB/s Cycles per byte = 116734.97
RSA     2048   public     14700 ops took 1.000 sec, avg 0.068 ms, 14700.000 ops/sec, 3418517598 cycles 232552.2 Cycles/op
RSA     2048  private       200 ops took 1.000 sec, avg 5.000 ms, 200.000 ops/sec, 3977612316 cycles 19888061.6 Cycles/op
DH      2048  key gen      1071 ops took 1.000 sec, avg 0.934 ms, 1071.000 ops/sec, 2825190480 cycles 2637899.6 Cycles/op
DH      2048    agree       700 ops took 1.000 sec, avg 1.429 ms, 700.000 ops/sec, 3816485350 cycles 5452121.9 Cycles/op
Benchmark complete
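
To compare the two runs, the throughput rows can be reduced to per-algorithm speedup ratios. A minimal sketch, assuming each run's output has been saved to a file (the file names below are placeholders); `bench_speedup` is a hypothetical helper, and it only handles rows reported in MiB/s, so PBKDF2, RSA and DH are skipped:

```shell
#!/bin/sh
# Hypothetical helper (not from the PR).
# Usage: bench_speedup ASM_LOG NOASM_LOG
# Prints "<algorithm> <ratio>x" for every MiB/s row present in both logs,
# where ratio = asm throughput / non-asm throughput.
bench_speedup() {
    awk '
    / MiB\/s / {
        # Find the "took" field. The algorithm name is every field up to
        # three before it; the MiB/s figure is three fields after it.
        for (t = 1; t <= NF; t++) if ($t == "took") break
        if (t > NF) next
        name = $1
        for (i = 2; i <= t - 3; i++) name = name " " $i
        rate = $(t + 3)
        if (FNR == NR) asm[name] = rate          # first file: asm build
        else if ((name in asm) && rate > 0)      # second file: no-asm build
            printf "%-24s %.1fx\n", name, asm[name] / rate
    }' "$1" "$2"
}

# Run only when two log files are given on the command line.
if [ "$#" -eq 2 ]; then
    bench_speedup "$1" "$2"
fi
```

For example, saving the script as speedup.sh and running `sh speedup.sh asm.log noasm.log` on the numbers above gives about 36.7x for AES-128-GCM-enc (5505 vs. 150 MiB/s) and 1.0x for SHA (545 MiB/s in both runs).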

Misc

--enable-sp-asm with --enable-ecc is not supported.

@philljj philljj self-assigned this Jan 26, 2026

philljj commented Jan 28, 2026

Retest this please.

(Several failures were caused by network issues.)

@philljj philljj requested a review from douzzer January 28, 2026 18:12

philljj commented Jan 28, 2026

Retest this please.

(The tests completed, but the job then appears to have hung.)
